Capacity of DNA Data Embedding Under Substitution Mutations
A number of methods have been proposed over the last decade for encoding
information using deoxyribonucleic acid (DNA), giving rise to the emerging area
of DNA data embedding. Since a DNA sequence is conceptually equivalent to a
sequence of quaternary symbols (bases), DNA data embedding (diversely called
DNA watermarking or DNA steganography) can be seen as a digital communications
problem where channel errors are tantamount to mutations of DNA bases.
Depending on the use of coding or noncoding DNA hosts, which, respectively,
denote DNA segments that can or cannot be translated into proteins, DNA data
embedding is essentially a problem of communications with or without side
information at the encoder. In this paper the Shannon capacity of DNA data
embedding is obtained for the case in which DNA sequences are subject to
substitution mutations modelled using the Kimura model from molecular evolution
studies. Inferences are also drawn with respect to the biological implications
of some of the results presented.
Comment: 22 pages, 13 figures; preliminary versions of this work were
presented at the SPIE Media Forensics and Security XII conference (January
2010) and at the IEEE ICASSP conference (March 2010).
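As a rough illustration of the noncoding case (no side information at the encoder), the sketch below computes the capacity of a single use of a Kimura-style substitution channel. The parameterisation (a probability `p_transition` for the one transition and `p_transversion` for each of the two transversions) and the function names are assumptions made for this example, not the paper's notation. Every row of such a transition matrix is a permutation of every other row, and likewise for the columns, so the channel is symmetric and its capacity is log2(4) minus the entropy of one row.

```python
import math


def row_entropy(probs):
    """Shannon entropy in bits of one row of the transition matrix."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def kimura_capacity(p_transition, p_transversion):
    """Capacity in bits per base of a symmetric 4-ary substitution channel.

    Assumed model: a base stays unchanged with probability p_same,
    undergoes its single transition with probability p_transition, and
    undergoes each of its two transversions with probability p_transversion.
    """
    p_same = 1.0 - p_transition - 2.0 * p_transversion
    if min(p_same, p_transition, p_transversion) < 0:
        raise ValueError("probabilities must be non-negative and sum to 1")
    row = [p_same, p_transition, p_transversion, p_transversion]
    # Symmetric channel: capacity = log2(alphabet size) - H(row).
    return 2.0 - row_entropy(row)
```

With no mutations the function returns 2 bits per base, and it drops to 0 when all four outcomes are equally likely. It covers one mutation stage only and does not model coding DNA hosts.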
On the embedding capacity of DNA strands under insertion, deletion and substitution mutations
Paper presented at Media Forensics and Security XII, SPIE-IS&T Electronic Imaging conference, 18-20 January 2010, San Jose, California.
A number of methods have been proposed over the last decade for embedding information within deoxyribonucleic acid (DNA). Since a DNA sequence is conceptually equivalent to a unidimensional digital signal, DNA data embedding (diversely called DNA watermarking or DNA steganography) can be seen either as a traditional communications problem or as an instance of communications with side information at the encoder, similar to data hiding. These two cases correspond to the use of noncoding or coding DNA hosts, which, respectively, denote DNA segments that cannot or can be translated into proteins. A limitation of existing DNA data embedding methods is that none of them has been designed according to optimal coding principles. Nor is it possible to evaluate how close to optimality these methods are without determining the Shannon capacity of DNA data embedding. This is the main topic studied in this paper, where we consider that DNA sequences may be subject to substitution, insertion, and deletion mutations.
Science Foundation Ireland
Gene Tagging and the Data Hiding Rate
23rd IET Irish Signals and Systems Conference, Maynooth, Ireland, 28-29th June, 2012.
We analyze the maximum number of ways in which one can intrinsically tag
a very particular kind of digital asset: a gene, which is just a DNA sequence that encodes
a protein. We consider gene tagging under the most relevant biological constraints:
protein encoding preservation with and without codon count preservation. We show
that our finite and deterministic combinatorial results are asymptotically (as the length
of the gene increases) particular cases of the stochastic Gel'fand and Pinsker capacity
formula for communications with side information at the encoder, which lies at the
foundations of data hiding theory. This is because gene tagging is a particular case of
DNA watermarking.
Science Foundation Ireland
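The two constraints named in this abstract can be illustrated with a small counting sketch. It assumes the standard genetic code, a gene given as a string over `TCAG` whose length is a multiple of three, and one reading of "codon count preservation": synonymous codons may only be permuted among the positions of the same amino acid. The function names are hypothetical, and this is not the paper's derivation.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
DEGENERACY = Counter(CODON_TO_AA.values())  # number of synonymous codons per amino acid


def split_codons(gene):
    """Split a gene into consecutive codons (triplets of bases)."""
    if len(gene) % 3:
        raise ValueError("gene length must be a multiple of 3")
    return [gene[i:i + 3] for i in range(0, len(gene), 3)]


def taggings_protein_preserving(gene):
    """Number of DNA sequences that encode the same protein as `gene`."""
    return math.prod(DEGENERACY[CODON_TO_AA[c]] for c in split_codons(gene))


def taggings_codon_count_preserving(gene):
    """Same as above, but every codon must keep its number of occurrences."""
    per_aa = {}
    for codon in split_codons(gene):
        per_aa.setdefault(CODON_TO_AA[codon], Counter())[codon] += 1
    total = 1
    for counts in per_aa.values():
        # Multinomial coefficient: arrangements of this amino acid's codons.
        ways = math.factorial(sum(counts.values()))
        for k in counts.values():
            ways //= math.factorial(k)
        total *= ways
    return total
```

For the toy gene `ATGGCTGCC` (Met-Ala-Ala) the first count is 1 x 4 x 4 = 16, i.e. log2(16) = 4 bits, while the codon-count-preserving count is 2, i.e. 1 bit.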
Optimum Exact Histogram Specification
2018 IEEE International Conference on Acoustics, Speech and Signal Processing, Calgary, Alberta, Canada (ICASSP-2018), 15-20 April 2018.
Exact histogram specification (EHS) is a classic image processing problem which generalises histogram equalisation. Over the years, no optimum solution to the EHS problem has been given with respect to any similarity criterion. An analytic and efficient solution to the optimum EHS problem, according to the mean squared error (MSE) criterion, is presented here. The inverse problem is also examined, and closed-form performance analyses are given in both cases.
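One way to picture an MSE-optimal assignment is through the rearrangement inequality: among all ways of placing a fixed multiset of target values on the pixels, pairing the sorted pixels with the sorted targets minimises the sum of squared differences. The sketch below applies that idea to a flattened image. It is an illustration under stated assumptions (the target histogram is passed as an explicit list of values, ties are broken by pixel position, the function name is hypothetical) and is not claimed to be the paper's algorithm.

```python
def exact_histogram_specification(image, target_values):
    """Return an image with exactly the multiset `target_values`.

    `image` is a flat list of pixel values and `target_values` a list of
    the same length whose histogram is the one to be imposed. The pixel
    with the k-th smallest value receives the k-th smallest target value,
    which minimises the squared error against `image`.
    """
    if len(image) != len(target_values):
        raise ValueError("image and target must have the same length")
    # Pixel indices from darkest to brightest (sorted() is stable on ties).
    order = sorted(range(len(image)), key=lambda i: image[i])
    out = [None] * len(image)
    for index, value in zip(order, sorted(target_values)):
        out[index] = value
    return out
```

For instance, imposing the values `[10, 0, 4]` on the image `[5, 1, 3]` gives `[10, 0, 4]`: the brightest pixel receives 10 and the darkest receives 0.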
On the Shannon capacity of DNA data embedding
2010 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Dallas, USA, March 14-19, 2010.
This paper first gives a brief overview of information embedding in deoxyribonucleic acid (DNA) sequences and its applications. DNA data embedding can be considered as a particular case of communications with or without side information, depending on the use of coding or noncoding DNA sequences, respectively. Although several DNA data embedding methods have been proposed over the last decade, it is still an open question to determine the maximum amount of information that can theoretically be embedded, that is, its Shannon capacity. This is the main question tackled in this paper.
Science Foundation Ireland
Optimum Reversible Data Hiding and Permutation Coding
7th IEEE International Workshop on Information Forensics and Security (WIFS), Rome, Italy, 16-19 November, 2015.
This paper is mainly devoted to investigating the connection between binary reversible data hiding and permutation coding. We start by undertaking an approximate combinatorial analysis of the embedding capacity of reversible watermarking in the binary Hamming case, which asymptotically shows that optimum reversible watermarking must involve not only 'writing on dirty paper', as in any blind data hiding scenario, but also writing on the dirtiest parts of the paper. The asymptotic analysis leads to the information-theoretical result given by Kalker and Willems more than a decade ago. Furthermore, the novel viewpoint of the problem suggests a near-optimum reversible watermarking algorithm for the low embedding distortion regime based on permutation coding. A practical implementation of permutation coding, previously proposed in the context of maximum-rate perfect steganography of memoryless hosts, can be used to implement the algorithm. The paper concludes with a discussion on the evaluation of the general rate-distortion bound for reversible data hiding.
University College Dublin
The role of permutation coding in minimum-distortion perfect counterforensics
39th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May, 2014.
This paper exploits the connection between minimum-distortion perfect counterforensics and maximum-rate perfect steganography in order to provide the optimum solution to the first of these problems, in the case in which the forensic detector solely uses first-order statistics. The solution relies on Slepian's Variant I permutation codes, which had previously been shown to implement maximum-rate perfect steganography when the host is memoryless (equivalently, when the steganographic detector only uses first-order statistics). Additionally, we demonstrate a blind counterforensic strategy made possible by permutation decoding, which may also find application in image processing.
Science Foundation Ireland
Blind Turbo Decoding of Side-Informed Data Hiding Using Iterative Channel Estimation
Distortion-Compensated Dither Modulation (DC-DM) has been theoretically shown to be a near-capacity achieving data hiding method, thanks to its use of side information at the encoder. In practice, channel coding is needed to approach its achievable rate limit. However, the most powerful coding methods, such as turbo coding, require knowledge of the channel model. We investigate here the possibility of undertaking blind iterative decoding of DC-DM. To this end, we undertake maximum likelihood estimation of the channel model, intertwining the Expectation-Maximization algorithm within the decoding procedure.
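For readers unfamiliar with DC-DM, a minimal scalar sketch of the embedding step may help. It assumes binary messages, a uniform quantiser of step `delta`, a dither of `bit * delta / 2`, and a compensation factor `alpha`; all names and default values are illustrative. The decoder shown is a plain minimum-distance rule, not the blind turbo decoding with EM channel estimation that the abstract describes.

```python
def quantize(x, delta, dither):
    """Nearest point of the lattice delta * Z shifted by `dither`."""
    return delta * round((x - dither) / delta) + dither


def dcdm_embed(x, bit, delta=1.0, alpha=0.7):
    """Embed one bit in the host sample x with scalar DC-DM.

    The host is moved only a fraction `alpha` of the way towards the
    quantiser centroid selected by `bit`; alpha = 1 gives plain dither
    modulation.
    """
    q = quantize(x, delta, bit * delta / 2.0)
    return x + alpha * (q - x)


def dcdm_decode(y, delta=1.0):
    """Pick the bit whose shifted lattice lies closest to y."""
    d0 = abs(y - quantize(y, delta, 0.0))
    d1 = abs(y - quantize(y, delta, delta / 2.0))
    return 0 if d0 <= d1 else 1
```

Without channel noise and with `alpha` above 0.5, the residual self-noise `(1 - alpha) * (x - q)` stays below `delta / 4`, so the minimum-distance rule recovers the bit.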
Asymptotically Optimum Perfect Universal Steganography of Finite Memoryless Sources
A solution to the problem of asymptotically optimum perfect universal steganography of finite memoryless sources with a passive warden is provided, which is then extended to contemplate a distortion constraint. The solution rests on the fact that Slepian's Variant I permutation coding implements first-order perfect universal steganography of finite host signals with optimum embedding rate. The duality between perfect universal steganography with asymptotically optimum embedding rate and lossless universal source coding with asymptotically optimum compression rate is evinced in practice by showing that permutation coding can be implemented by means of adaptive arithmetic coding. Next, a distortion constraint between the host signal and the information-carrying signal is considered. Such a constraint is essential whenever real-world host signals with memory (e.g., images, audio, or video) are decorrelated to conform to the memoryless assumption. The constrained version of the problem requires trading off embedding rate and distortion. Partitioned permutation coding is shown to be a practical way to implement this trade-off, performing close to an unattainable upper bound on the rate-distortion function of the problem.
Science Foundation Ireland
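The core idea of permutation coding can be sketched in a few lines: the message selects one rearrangement of the host, so the histogram (the first-order statistics) of the information-carrying signal is identical to that of the host. The sketch below indexes rearrangements in lexicographic order by direct counting. This is an assumption chosen for brevity, not the adaptive arithmetic coding implementation mentioned in the abstract, and the function names are hypothetical.

```python
import math
from collections import Counter


def num_rearrangements(counts):
    """Multinomial coefficient: distinct orderings of a multiset."""
    total = math.factorial(sum(counts.values()))
    for k in counts.values():
        total //= math.factorial(k)
    return total


def permutation_encode(host, message):
    """Return the `message`-th rearrangement of `host` (lexicographic order)."""
    counts = Counter(host)
    if not 0 <= message < num_rearrangements(counts):
        raise ValueError("message index out of range for this host")
    out = []
    for _ in range(len(host)):
        for sym in sorted(counts):
            if counts[sym] == 0:
                continue
            counts[sym] -= 1
            block = num_rearrangements(counts)  # orderings starting with sym
            if message < block:
                out.append(sym)
                break
            message -= block
            counts[sym] += 1
    return out


def permutation_decode(stego):
    """Recover the message index from a rearranged signal."""
    counts = Counter(stego)
    message = 0
    for s in stego:
        for sym in sorted(counts):
            if sym == s:
                break
            if counts[sym] == 0:
                continue
            counts[sym] -= 1
            message += num_rearrangements(counts)  # skip orderings with smaller sym
            counts[sym] += 1
        counts[s] -= 1
    return message
```

A host such as `[0, 0, 1, 2]` has 4!/(2! 1! 1!) = 12 rearrangements, so it can carry log2(12) bits, roughly 0.9 bits per sample. The sketch ignores any distortion constraint between host and output.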